The Multimedia Lexicon: Automatic object and structure discovery in audio-video-text content
Abstract
We understand the real world in terms of objects: When we look at a picture or our surroundings, or when we close our eyes and listen, we organize the information arriving from the outside world in terms of attributes for perceived, independent objects. “Independence” can refer to physical separability (the pen on my desk), behavioral autonomy (the hissing steam valve on the radiator), or more conceptual distinctions, but it is something that human observers can largely agree upon.
Similar resources
Automatic Movie Abstracting and Its Presentation on an HTML-Page
Presented is an algorithm for automatic production of a video abstract of a feature film, similar to a movie trailer. It selects clips from the original movie based on detection of special events like dialogs, shots, explosions and text occurrences, and on general action indicators applied to scenes. These clips are then assembled to form a video trailer using a model of editing. Additional cli...
REIHE INFORMATIK 3 / 97 Automatic Movie Abstracting
Presented is an algorithm for automatic production of a video abstract of a feature film, similar to a movie trailer. It selects clips from the original movie based on detection of special events like dialogs, shots, explosions and text occurrences, and on general action indicators applied to scenes. These clips are then assembled to form a video trailer using a model of editing. Additional cli...
Automatic Annotation of Formula 1 Races for Content-Based Video Retrieval
Content-based video retrieval is emerging as an important part in the process of utilization of various multimedia documents. In this report we present a novel system for the automatic indexing and content-based retrieval of multimedia documents. We chose the domain of Formula 1 sport videos because the manual annotation of Formula 1 races is complicated and time consuming. Our system uses mult...
Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues
We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, name...
Semantic Multi-modal Analysis, Structuring, and Visualization for Candid Personal Interaction Videos
Videos are rich in multimedia content and semantics, which should be used by video browsers to better present the audio-visual information to the viewer. Ubiquitous video players allow for content to be scanned linearly, rarely providing summaries or methods for searching. Through analysis of audio and video tracks, it is possible to extract text transcripts from audio, displayed text from vide...